在人工智能中,我们经常寻求确定许多变量的未知目标函数$ y = f(\ mathbf {x})$给出有限的例子$ s = \ {(\ mathbf {x ^ {(i)}} ,y ^ {(i)})\} $ with $ \ mathbf {x ^ {(i)}} \以$ d $是一个感兴趣的域名。我们将$ S $称为培训集和最终任务是识别近似于新$ \ MATHBF {x} $近似于此目标函数的数学模型;使用$ t \ neq s $(即,测试模型泛化),设置$ t = \ {\ mathbf {x ^ {x ^ {x ^ {x ^ {x ^ {x ^ {x ^ {x ^ {x ^ {x但是,对于某些应用,主要兴趣是近似于较大的域名$ d'$的未知函数,该域为$ d $。例如,在涉及设计新结构的情况下,我们可能有兴趣最大化$ F $;因此,源自$ S $的模型也应该在$ d'$以$ y $大于$ s $ m $的值概括为$ d'$。从这种意义上讲,AI系统将提供重要信息,可以指导设计过程,例如,使用学习模型作为设计新实验室实验的代理功能。通过结合添加剂样条模型,我们基于持续分数的迭代配合来介绍一种多变量回归的方法。我们将其与Adaboost,内核,线性回归,Lasso Lars,线性支持向量回归,多层感知,随机林,随机梯度下降和XGBoost等方法进行比较。我们基于物理化学特性预测超导体临界温度的重要问题的性能。
translated by 谷歌翻译
The cover is the face of a book and is a point of attraction for the readers. Designing book covers is an essential task in the publishing industry. One of the main challenges in creating a book cover is representing the theme of the book's content in a single image. In this research, we explore ways to produce a book cover using artificial intelligence based on the fact that there exists a relationship between the summary of the book and its cover. Our key motivation is the application of text-to-image synthesis methods to generate images from given text or captions. We explore several existing text-to-image conversion techniques for this purpose and propose an approach to exploit these frameworks for producing book covers from provided summaries. We construct a dataset of English books that contains a large number of samples of summaries of existing books and their cover images. In this paper, we describe our approach to collecting, organizing, and pre-processing the dataset to use it for training models. We apply different text-to-image synthesis techniques to generate book covers from the summary and exhibit the results in this paper.
translated by 谷歌翻译
这项工作使用水果和叶子的图像提出了一个基于学习的植物性诊断系统。已经使用了五个最先进的卷积神经网络(CNN)来实施该系统。迄今为止,模型的精度一直是此类应用程序的重点,并且尚未考虑模型的模型适用于最终用户设备。两种模型量化技术,例如float16和动态范围量化已应用于五个最新的CNN体系结构。研究表明,量化的GoogleNet模型达到了0.143 MB的尺寸,准确度为97%,这是考虑到大小标准的最佳候选模型。高效网络模型以99%的精度达到了4.2MB的大小,这是考虑性能标准的最佳模型。源代码可在https://github.com/compostieai/guava-disease-detection上获得。
translated by 谷歌翻译
肺癌是最致命的癌症之一,部分诊断和治疗取决于肿瘤的准确描绘。目前是最常见的方法的人以人为本的分割,须遵守观察者间变异性,并且考虑到专家只能提供注释的事实,也是耗时的。最近展示了有前途的结果,自动和半自动肿瘤分割方法。然而,随着不同的研究人员使用各种数据集和性能指标验证了其算法,可靠地评估这些方法仍然是一个开放的挑战。通过2018年IEEE视频和图像处理(VIP)杯竞赛创建的计算机断层摄影扫描(LOTUS)基准测试的肺起源肿瘤分割的目标是提供唯一的数据集和预定义的指标,因此不同的研究人员可以开发和以统一的方式评估他们的方法。 2018年VIP杯始于42个国家的全球参与,以获得竞争数据。在注册阶段,有129名成员组成了来自10个国家的28个团队,其中9个团队将其达到最后阶段,6队成功完成了所有必要的任务。简而言之,竞争期间提出的所有算法都是基于深度学习模型与假阳性降低技术相结合。三种决赛选手开发的方法表明,有希望的肿瘤细分导致导致越来越大的努力应降低假阳性率。本次竞争稿件概述了VIP-Cup挑战,以及所提出的算法和结果。
translated by 谷歌翻译
Deep neural networks (DNNs) are vulnerable to a class of attacks called "backdoor attacks", which create an association between a backdoor trigger and a target label the attacker is interested in exploiting. A backdoored DNN performs well on clean test images, yet persistently predicts an attacker-defined label for any sample in the presence of the backdoor trigger. Although backdoor attacks have been extensively studied in the image domain, there are very few works that explore such attacks in the video domain, and they tend to conclude that image backdoor attacks are less effective in the video domain. In this work, we revisit the traditional backdoor threat model and incorporate additional video-related aspects to that model. We show that poisoned-label image backdoor attacks could be extended temporally in two ways, statically and dynamically, leading to highly effective attacks in the video domain. In addition, we explore natural video backdoors to highlight the seriousness of this vulnerability in the video domain. And, for the first time, we study multi-modal (audiovisual) backdoor attacks against video action recognition models, where we show that attacking a single modality is enough for achieving a high attack success rate.
translated by 谷歌翻译
Unmanned aerial vehicle (UAV) swarms are considered as a promising technique for next-generation communication networks due to their flexibility, mobility, low cost, and the ability to collaboratively and autonomously provide services. Distributed learning (DL) enables UAV swarms to intelligently provide communication services, multi-directional remote surveillance, and target tracking. In this survey, we first introduce several popular DL algorithms such as federated learning (FL), multi-agent Reinforcement Learning (MARL), distributed inference, and split learning, and present a comprehensive overview of their applications for UAV swarms, such as trajectory design, power control, wireless resource allocation, user assignment, perception, and satellite communications. Then, we present several state-of-the-art applications of UAV swarms in wireless communication systems, such us reconfigurable intelligent surface (RIS), virtual reality (VR), semantic communications, and discuss the problems and challenges that DL-enabled UAV swarms can solve in these applications. Finally, we describe open problems of using DL in UAV swarms and future research directions of DL enabled UAV swarms. In summary, this survey provides a comprehensive survey of various DL applications for UAV swarms in extensive scenarios.
translated by 谷歌翻译
Compared to regular cameras, Dynamic Vision Sensors or Event Cameras can output compact visual data based on a change in the intensity in each pixel location asynchronously. In this paper, we study the application of current image-based SLAM techniques to these novel sensors. To this end, the information in adaptively selected event windows is processed to form motion-compensated images. These images are then used to reconstruct the scene and estimate the 6-DOF pose of the camera. We also propose an inertial version of the event-only pipeline to assess its capabilities. We compare the results of different configurations of the proposed algorithm against the ground truth for sequences of two publicly available event datasets. We also compare the results of the proposed event-inertial pipeline with the state-of-the-art and show it can produce comparable or more accurate results provided the map estimate is reliable.
translated by 谷歌翻译
With Twitter's growth and popularity, a huge number of views are shared by users on various topics, making this platform a valuable information source on various political, social, and economic issues. This paper investigates English tweets on the Russia-Ukraine war to analyze trends reflecting users' opinions and sentiments regarding the conflict. The tweets' positive and negative sentiments are analyzed using a BERT-based model, and the time series associated with the frequency of positive and negative tweets for various countries is calculated. Then, we propose a method based on the neighborhood average for modeling and clustering the time series of countries. The clustering results provide valuable insight into public opinion regarding this conflict. Among other things, we can mention the similar thoughts of users from the United States, Canada, the United Kingdom, and most Western European countries versus the shared views of Eastern European, Scandinavian, Asian, and South American nations toward the conflict.
translated by 谷歌翻译
The performance of the Deep Learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may lead to noise in the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from performance deficiency. This paper presents a method that aims to learn a confident model in the presence of noisy labels. This is done in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding entropy or information-based regularizer to the classifier network. We conduct our experiments on a noisy version of MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluated the proposed method on the curated dataset, where the noise type and level of various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate how our model is more confident in predicting GT than other baselines. Finally, we assess our approach for segmentation problem and showcase its effectiveness with experiments.
translated by 谷歌翻译
This paper deals with the problem of statistical and system heterogeneity in a cross-silo Federated Learning (FL) framework where there exist a limited number of Consumer Internet of Things (CIoT) devices in a smart building. We propose a novel Graph Signal Processing (GSP)-inspired aggregation rule based on graph filtering dubbed ``G-Fedfilt''. The proposed aggregator enables a structured flow of information based on the graph's topology. This behavior allows capturing the interconnection of CIoT devices and training domain-specific models. The embedded graph filter is equipped with a tunable parameter which enables a continuous trade-off between domain-agnostic and domain-specific FL. In the case of domain-agnostic, it forces G-Fedfilt to act similar to the conventional Federated Averaging (FedAvg) aggregation rule. The proposed G-Fedfilt also enables an intrinsic smooth clustering based on the graph connectivity without explicitly specified which further boosts the personalization of the models in the framework. In addition, the proposed scheme enjoys a communication-efficient time-scheduling to alleviate the system heterogeneity. This is accomplished by adaptively adjusting the amount of training data samples and sparsity of the models' gradients to reduce communication desynchronization and latency. Simulation results show that the proposed G-Fedfilt achieves up to $3.99\% $ better classification accuracy than the conventional FedAvg when concerning model personalization on the statistically heterogeneous local datasets, while it is capable of yielding up to $2.41\%$ higher accuracy than FedAvg in the case of testing the generalization of the models.
translated by 谷歌翻译